-
Notifications
You must be signed in to change notification settings - Fork 438
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Propagate resource to exporters #706
Conversation
Codecov Report
@@ Coverage Diff @@
## main #706 +/- ##
==========================================
+ Coverage 94.72% 94.77% +0.04%
==========================================
Files 216 217 +1
Lines 9900 9928 +28
==========================================
+ Hits 9378 9409 +31
+ Misses 522 519 -3
|
@@ -244,6 +259,7 @@ class SpanData final : public Recordable | |||
std::vector<SpanDataEvent> events_; | |||
std::vector<SpanDataLink> links_; | |||
opentelemetry::trace::SpanKind span_kind_{opentelemetry::trace::SpanKind::kInternal}; | |||
opentelemetry::sdk::resource::ResourceAttributes resourceAttributes_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this used for? Is it worth copying the resource attributes in every span?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for noticing it. Will make it reference/pointer to resource :)
Generally, LGTM. |
@@ -67,6 +71,7 @@ class Recordable final : public sdk::trace::Recordable | |||
|
|||
private: | |||
ZipkinSpan span_; | |||
std::string service_name_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking here about some optimization opportunity: how often do you expect service_name
value to change, i.e. do you have to copy it here for every recordable, then again into every JSON
(ZipkinSpan) object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can think of some optimization of not passing resource and instrumentation library as part of every Recordable( and instead just passing tracer reference from where we can get both), but still, these data need to be copied to every ZipkinSpan. ( https://zipkin.io/zipkin-api/#/default/post_spans ).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine for now. I am looking at Zipkin exporter as example for Fluentd exporter, I can share some thoughts how I would've rearranged this in the Fluentd exporter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will look forward to that :)
@@ -213,6 +213,16 @@ void Recordable::SetName(nostd::string_view name) noexcept | |||
span_["name"] = name.data(); | |||
} | |||
|
|||
void Recordable::SetResource(const opentelemetry::sdk::resource::Resource &resource) noexcept |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirming re. thread-safety of SetResource
vs GetServiceName
-- both cannot be called at once from two different threads?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SetResource
would be called when Span is created using Tracer::StartSpan()
. And GetServiceName()
would be when Span is ending : Span::End() -> Processor::OnEnd() -> Exporter::Export()
. Both these need to be sequential in same thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would let it go for now. But we may need to describe some of these expectations where we do not explicitly shield the access with a mutex, esp. in the other place where we return an object (that may get changed in another thread) by reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Specifically for Resources, these are immutable as per the specs, with their ownership transferred to TracerProvider while it's initialization. So there won't be race condition arising while accessing them within the pipeline.
@@ -50,6 +50,9 @@ class Tracer final : public trace_api::Tracer, public std::enable_shared_from_th | |||
return *instrumentation_library_; | |||
} | |||
|
|||
/** Returns the currently configured resource **/ | |||
const opentelemetry::sdk::resource::Resource &GetResource() { return context_->GetResource(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No setter - SetResource
in API, only getter - GetResource
? What are the thread-safety guarantees of this method on API surface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The resource is a property of TracerProvider
and should be set during the initialization of TracerProvider in the main thread. It's not re-settable again once initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approved.
From a practical standpoint, however, we do have scenarios as follows:
-
whatever Monitoring Agent is upgraded, OR remote configuration is upgraded; so your resource values changed...
-
your
resource
contents now may need to change, e.g. updated configuration of theTracerProvider
- to indicate that a new configuration is now applied. Without application service restart. Your threads continue running, while accessing resource, while the resource object is maybe changing at the same time.. which may lead to crash.
In those cases, I think, we may need to consider how to make some of this thread-safe.
Current logic of returning Resource
by reference in SDK may break if Resource
needs to change in a different thread. I am not sure how to track my concern. Maybe we need to have a generic umbrella issue that captures thread-safety guarantees at both levels?
- API thread safety guarantees
- SDK thread safety guarantees
In a separate markdown document, dev-focused document for exporter developers.. Or when you return a reference, you can check where else that object may be concurrently modified, and explain why it should not be concurrently modified elsewhere. Such as explicitly spell out the thread-safety guarantees on that object that you returned by reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- your
resource
contents now may need to change, e.g. updated configuration of theTracerProvider
- to indicate that a new configuration is now applied. Without application service restart. Your threads continue running, while accessing resource, while the resource object is maybe changing at the same time.. which may lead to crash.
The ownership model of the Resource object is well defined. Once created, the ownership of the Resource object is transferred to TracerProvider, and there is no setters to update it at runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also to add, as per the specs : https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/resource/sdk.md
When used with distributed tracing, a resource can be associated with the TracerProvider when the TracerProvider is created. That association cannot be changed later. When associated with a TracerProvider, all Spans produced by any Tracer from the provider MUST be associated with this Resource.
The resource once set can't be changed later.
exporters/otlp/include/opentelemetry/exporters/otlp/recordable.h
Outdated
Show resolved
Hide resolved
@@ -225,6 +232,11 @@ class SpanData final : public Recordable | |||
span_kind_ = span_kind; | |||
} | |||
|
|||
void SetResource(const opentelemetry::sdk::resource::Resource &resource) noexcept override | |||
{ | |||
resource_ = &resource; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lalitb - this is problematic spot. It won't work too well with temporaries because it's assigning a pointer to reference. If Resource
is compiled with std::variant
, and a temporary is passed to API, e.g. SetResource("key", "value")
- it'd break..
Fixes #334
Changes
Please provide a brief description of the changes here.
For significant contributions please make sure you have completed the following items:
CHANGELOG.md
updated for non-trivial changes